Novel Risks of Unfavorable Corticosteroid Response in Patients with Mild-to-Moderate COVID-19 Identified Using Artificial Intelligence-Assisted Analysis of Chest Radiographs

The prediction of corticosteroid responses in coronavirus disease 2019 (COVID-19) patients is crucial in clinical practice, and exploring the role of artificial intelligence (AI)-assisted analysis of chest radiographs (CXR) is warranted. This retrospective case–control study involving mild-to-moderate COVID-19 patients treated with corticosteroids was conducted from 4 September 2021 to 30 August 2022. The primary endpoint of the study was an unfavorable corticosteroid response, defined as an advancement of two or more categories on the eight-categories-ordinal scale. Serial abnormality scores for consolidation and pleural effusion on CXR were obtained using commercial AI-based software and grouped by days from the onset of symptoms. Among the 258 participants included in the analysis, 147 (57%) were male. Multivariable logistic regression analysis revealed that a high pleural effusion score at 6–9 days from the onset of symptoms (adjusted odds ratio (aOR): 1.022, 95% confidence interval (CI): 1.003–1.042, p = 0.020) and high consolidation scores up to 9 days from the onset of symptoms (0–2 days: aOR: 1.025, 95% CI: 1.006–1.045, p = 0.010; 3–5 days: aOR: 1.03, 95% CI: 1.011–1.051, p = 0.002; 6–9 days: aOR: 1.052, 95% CI: 1.015–1.089, p = 0.005) were associated with an unfavorable corticosteroid response. AI-generated scores could help guide decisions on the use of corticosteroids in COVID-19 patients who would not benefit from them.


Introduction
Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) causes COVID-19, a pandemic that has affected the lives of 766 million individuals worldwide [1]. Efforts have been made to mitigate the detrimental effect of this disease, and corticosteroids, a type of immune modulator, have played a pivotal role in reducing mortality rates, as demonstrated in large-scale randomized controlled trials [2][3][4][5]. The mechanism involved in steroid responsiveness lies in its ability to reduce hyperimmune activation triggered by SARS-CoV-2 [6,7]. However, determining and predicting the treatment response to corticosteroids is complicated, making it challenging to identify individuals who will benefit the most from this therapy. These difficulties led to the establishment of criteria for escalating immunomodulator therapy based solely on clinical observation of hypoxia exacerbation [8]. To avoid cases refractory to corticosteroids or rebound phenomena during steroid reduction or after discontinuation, additional methods for predicting corticosteroid responsiveness are required [2,9,10].
The pathophysiologic mechanism of tissue tropism of SARS-CoV-2 through the angiotensin-converting enzyme 2 receptor, which damages alveolar epithelial and capillary endothelial cells by an immune reaction, suggests that imaging modalities could be used to predict the prognosis of COVID-19 patients [11][12][13]. A study by Liang et al. highlighted the utility of a scoring system that includes a chest radiograph (CXR) as a factor to predict the prognosis of COVID-19 patients [14], while D'Cruz et al. presented opposing views regarding its role [15]. The discrepant results might stem from the absence of standardized measurements of CXR findings that are precise and can be quantified.
The shortcomings of imaging modalities are expected to be overcome with the help of deep learning algorithms applied to chest imaging. The role of artificial intelligence (AI)-assisted algorithms in diagnosing and predicting the prognosis of COVID-19 has been widely tested and validated recently [16][17][18][19]. Further use of this technology in identifying COVID-19 patients with an unfavorable corticosteroid response by monitoring AI-based changes in CXR findings is anticipated and deserves further investigation.
Our institution introduced an AI-assisted CXR imaging technology tested and validated in other studies [20][21][22]. This software helps to detect various lesions and provides an abnormality score for each CXR taken. We aimed to explore the utility of an AI-generated CXR abnormality (AI-CXR) score in predicting the outcome of patients hospitalized for COVID-19 and treated with corticosteroids.

Study Design and Population
This retrospective case-control study was conducted in a university-affiliated, 500-bed hospital in South Korea. We enrolled mild-to-moderate COVID-19 patients treated with corticosteroids from 4 September 2021 to 30 August 2022. This institution was designated to provide care for mild-to-moderate COVID-19 patients who need hospitalization. Patients whose condition deteriorated and required mechanical ventilation were transferred to other hospitals dedicated to taking care of critically ill patients. Hospitalized patients were treated according to the National Institutes of Health's COVID-19 Treatment Guidelines [8], except in the early phase of the pandemic when proper treatment guidelines had not been established. Corticosteroids were the most commonly prescribed drugs in the early phase of the pandemic due to easy accessibility in healthcare settings. Enrolled patients were followed up until discharge, and the last follow-up date of the last patient was 6 October 2022. Patients were enrolled according to the following criteria: (1) hospitalized with acute COVID-19 infection confirmed using real-time polymerase chain reaction tests and (2) a history of corticosteroid use of an equivalent dose of dexamethasone 6 mg or less during the SARS-CoV-2 infection regardless of type or date of initiation.
Any patient meeting the following conditions was excluded from the study: (1) under the age of 19 years; (2) without CXR results; and (3) corticosteroid use exceeding an equivalent dose of dexamethasone 6 mg.
The primary endpoint of corticosteroid unresponsiveness was defined as a deterioration of the patient's condition manifested by the advancement of two or more in the World Health Organization eight-categories-ordinal scale (Table S1) at the time of discharge, or no improvement of a condition if the patient was initially categorized in the 5th category or worse at the time of COVID-19 confirmation.
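The endpoint definition above can be read as a simple two-part rule. The following is a minimal sketch of that rule, not the authors' code. It assumes that the ordinal categories are numbered 1 to 8 and that a larger number denotes a worse clinical state; the function name `is_unfavorable_response` and its parameters are hypothetical.

```python
def is_unfavorable_response(baseline: int, discharge: int) -> bool:
    """Classify one patient under the primary-endpoint definition.

    Assumption (not stated in this section): categories run from 1 to 8
    and a larger number means a worse clinical state.
    """
    if not (1 <= baseline <= 8 and 1 <= discharge <= 8):
        raise ValueError("ordinal categories must be between 1 and 8")

    # Part 1: deterioration by two or more categories at discharge.
    worsened_by_two = (discharge - baseline) >= 2

    # Part 2: started in the 5th category or worse and did not improve.
    no_improvement_from_severe = baseline >= 5 and discharge >= baseline

    return worsened_by_two or no_improvement_from_severe
```

Under these assumptions, a patient moving from category 3 to 5 is classified as unfavorable, whereas a patient moving from category 5 to 4 is not.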

Data Collection
The data of participants were collected retrospectively by reviewing electronic medical records. Age, sex, underlying condition (diabetes mellitus (DM), chronic obstructive pulmonary disease (COPD), history of myocardial infarction, chronic heart failure, peripheral vascular disease, chronic kidney disease (CKD), chronic liver disease, malignancy of solid organs, leukemia, lymphoma, cerebral vascular disease, dementia, connective tissue disease, peptic ulcer disease, hemiplegia, or human immunodeficiency virus infection), Charlson comorbidity index (CCI), and immunocompromised status determined by Centers for Disease Control and Prevention criteria [23] were recorded. Treatment, history of vaccination, and the types and duration of antiviral agents, antibacterial agents, and corticosteroids were reviewed. Laboratory values such as white blood cell count (WBC, 10^3/µL), platelet count (10^3/µL), lymphocyte percentage (%), C-reactive protein (CRP, mg/L), D-dimer (mcgFEU/mL), interleukin (IL)-6 (pg/mL), albumin (g/dL), and procalcitonin (PCT, ng/mL) were collected. CXR results were procured as described under the next subheading. CXR and laboratory results were chosen based on the date of onset of symptoms to assess the response according to the course of the disease. Serial results were obtained according to the following categories: (1) 0: 0-2 days from the event; (2) 1: 3-5 days from the event; (3) 2: 6-9 days from the event; and (4) 3: more than 10 days from the event. A single result in each category was included in the analysis.
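The time categories can be expressed as a small binning function. This is an illustrative sketch only; the function name `time_category` is hypothetical. The text does not say how day 10 itself is handled ("6-9 days" versus "more than 10 days"), so assigning day 10 to category 3 is an assumption made here.

```python
def time_category(days_from_onset: int) -> int:
    """Map days since symptom onset to the study's time category (0-3)."""
    if days_from_onset < 0:
        raise ValueError("days_from_onset must be non-negative")
    if days_from_onset <= 2:
        return 0  # 0-2 days from onset
    if days_from_onset <= 5:
        return 1  # 3-5 days from onset
    if days_from_onset <= 9:
        return 2  # 6-9 days from onset
    # Assumption: day 10 is grouped with the later days, since the text
    # leaves it unassigned.
    return 3
```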

AI-Based CXR Results
All CXRs were obtained in anteroposterior projection in each patient's room, as mandated by hospital policy for patients with highly contagious diseases. A commercially available AI-based lesion detection software (Lunit INSIGHT CXR, version 3, Lunit Inc., Seoul, Republic of Korea) was used to obtain the AI-CXR score of lung lesions. This software was developed using a certified convolutional neural network architecture and is capable of detecting a total of eight lesions on CXRs, including pulmonary nodule, consolidation, pneumothorax, fibrosis, atelectasis, cardiomegaly, pleural effusion, and pneumoperitoneum [24,25]. Since consolidation and pleural effusion were known to be associated with COVID-19 pneumonia, we extracted consolidation and pleural effusion AI-CXR scores from the AI server, which were integrated into all CXRs taken throughout hospitalization. The abnormality score by the AI software is presented as a percentage ranging from 0 to 100%, which indicates the AI-decided probability of the CXR having the lesion. Our hospital used a cutoff value of 15% for the abnormality score to decide the presence of the lesion, following the vendor's recommendation and another study [26]. Using this cutoff value, the software determines whether the lesion is present on the CXR and displays a contour map along with the abnormality score, as described in Figure S1.

Statistical Analysis
Participants with favorable and unfavorable corticosteroid responsiveness were compared. Baseline characteristics were compared using the Mann-Whitney U test or independent samples t-test for continuous variables, and the χ2 test or Fisher's exact test for categorical variables. Continuous variables are expressed as means ± standard deviation or medians (interquartile ranges), and categorical variables as numbers with percentages for the description of baseline characteristics. A generalized estimating equation model with logit links was used to analyze whether repeated-measured CXR results and laboratory data influenced the primary outcome. Univariate and multivariate logistic regression tests were performed to determine the change in the performance of the fitted model for each time category. Covariates for the multivariable logistic model were chosen based on p-value < 0.05 in a univariate analysis and clinical significance. Additionally, subgroup analysis involving patients with hypoxia and categorized according to the date of COVID-19 confirmation was conducted using a model that included AI-CXR score as a predictor. The association of the AI-CXR score with other biomarkers was estimated using linear regression analysis. A p-value < 0.05 was considered statistically significant. Cases with missing values in any category were excluded from the analysis. The prediction accuracy of the AI-CXR score was assessed using the area under the receiver operating characteristic (ROC) curve. For the statistical analysis, we used R (version 4.2.2, R Foundation for Statistical Computing, Vienna, Austria) and SPSS (version 26.0, IBM Corp., Armonk, NY, USA).
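To illustrate what the area under the ROC curve measures, the sketch below computes it directly as the probability that a randomly chosen patient with an unfavorable response has a higher score than a randomly chosen patient with a favorable response, with ties counted as one half. This is a generic pure-Python illustration, not the R or SPSS routine used in the study; the function name `roc_auc` and the example scores are hypothetical.

```python
def roc_auc(scores_unfavorable, scores_favorable):
    """Area under the ROC curve via pairwise comparison of scores.

    Equivalent to the Mann-Whitney U statistic divided by the number of
    (unfavorable, favorable) pairs; tied pairs contribute 0.5.
    """
    if not scores_unfavorable or not scores_favorable:
        raise ValueError("both groups must contain at least one score")

    concordant = 0.0
    for u in scores_unfavorable:
        for f in scores_favorable:
            if u > f:
                concordant += 1.0
            elif u == f:
                concordant += 0.5
    return concordant / (len(scores_unfavorable) * len(scores_favorable))


# Hypothetical abnormality scores (0-100), for illustration only.
example_auc = roc_auc([80, 60, 40], [50, 20, 10, 5])
```

In the hypothetical example, 11 of the 12 patient pairs are ordered correctly, so the resulting value is 11/12. An AUC of 0.5 corresponds to a score with no discriminative ability.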

AI-CXR Score as a Factor Associated with Unfavorable Corticosteroid Response
The pleural effusion score in category 2 (adjusted odds ratio (aOR) 1.022, 95% confidence interval (CI) 1.003-1.042, p = 0.02) and consolidation score in categories 0-2 (category 0: aOR 1.025, 95% CI 1.006-1.045, p = 0.01; category 1: aOR 1.03, 95% CI 1.011-1.051, p < 0.01; category 2: aOR 1.052, 95% CI 1.015-1.089, p < 0.01) were associated with an unfavorable outcome (Table 2). A box plot of the AI-CXR score according to the endpoint and time category is shown in Figure S2. The prediction accuracy of the AI-CXR score was estimated using ROC curve analysis. The area under the curve for the consolidation score ranged from 0.739 to 0.855 and that for the pleural effusion score ranged from 0.692 to 0.809; hence, both scores have significant power to predict an unfavorable corticosteroid response (Figure 2).
Notes to Table 2: Values with statistical significance of p < 0.05 were presented with bold type. Abbreviations: OR, odds ratio; aOR, adjusted odds ratio; CI, confidence interval. * OR was calculated using a generalized estimating equation for all measurements involved or logistic regression analysis in categorical measurements. † aOR was adjusted for age, sex, Charlson comorbidity index, immune status, vaccination status, antiviral agent usage, and antibacterial agent usage. ‡ 0-2 days from the onset of symptoms. § 3-5 days from the onset of symptoms. ‖ 6-9 days from the onset of symptoms. ¶ More than 10 days from the onset of symptoms.
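As a reading aid for the adjusted odds ratios, the short sketch below rescales an odds ratio to a larger score difference. It assumes that the AI-CXR score entered the logistic model as a continuous variable in percentage points, so that each reported aOR refers to a one-point increase; this section does not state that explicitly, and the function name `odds_ratio_for_increase` is hypothetical.

```python
def odds_ratio_for_increase(per_point_or: float, points: float) -> float:
    """Odds ratio implied by a `points`-unit increase in the predictor.

    In a logistic model the log-odds are linear in the predictor, so the
    odds ratio for a k-unit increase is the per-unit odds ratio to the
    power k.
    """
    if per_point_or <= 0:
        raise ValueError("an odds ratio must be positive")
    return per_point_or ** points


# Category 0 consolidation score: aOR 1.025 per point (from the text).
or_10_points = odds_ratio_for_increase(1.025, 10)
```

Under that assumption, a consolidation score that is 10 points higher in category 0 corresponds to roughly 1.28-fold higher odds of an unfavorable response, which is easier to interpret clinically than the per-point value.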

Association between AI-CXR Scores and Other Laboratory Tests Correlated with Unfavorable Corticosteroid Response
High CRP level was associated with an unfavorable corticosteroid response across all time categories. Low lymphocyte percentage was also associated with an unfavorable corticosteroid response, but only in category 2 (aOR 0.914, 95% CI 0.851-0.982, p = 0.01) and category 3 (aOR 0.857, 95% CI 0.752-0.970, p = 0.02) (Table S2). The differences in values between the two groups are presented in Table S3. The results of the subgroup analysis involving patients with conditions of concern are presented in Figures S3 and S4. Consolidation scores remained relevant in predicting corticosteroid responsiveness in patients with hypoxia and patients diagnosed in the Delta variant-dominant period. Pleural effusion score was associated with the outcome in the Omicron variant-dominant period.

Regarding variables that had a linear correlation with the AI-CXR score, CRP, albumin, and lymphocyte percentage showed a close correlation across all time categories with both consolidation and pleural effusion scores. The extent of correlation is expressed as a parameter estimate calculated using linear regression (Table 3).
Notes to Table 3: Values with statistical significance of p < 0.05 were presented with bold type. Abbreviations: WBC, white blood cell count; PLT, platelet count; CRP, C-reactive protein; IL-6, interleukin-6; NA, not applicable. * 0-2 days from the symptom onset. † 3-5 days from the symptom onset. ‡ 6-9 days from the symptom onset. § More than 10 days from the symptom onset.

Discussion
The findings of this study suggested that AI-based software applied to CXR could predict treatment response by utilizing previously underrecognized factors. The abnormality scores of pleural effusion and consolidation, generated by a commercially available AI software, demonstrated good predictive performance for corticosteroid responsiveness in COVID-19 patients. The positive correlation between AI-CXR scores and other biomarkers associated with an unfavorable response suggested the reliability of this technology.
To select patients who would benefit the most from corticosteroid treatment, risk factors associated with adverse outcomes need to be investigated. Previous studies have focused on laboratory results and underlying conditions instead of imaging findings. Murakami et al. proposed severe respiratory failure and high soluble IL-2 receptor, lactate dehydrogenase, and CRP levels as factors associated with adverse outcomes [27]. A study using deep learning algorithms in predicting corticosteroid responsiveness also included laboratory results such as lymphocyte percentage, PCT, and tumor necrosis factor α, IL-1β, IL-2 receptor, IL-6, IL-8, IL-10, and CRP levels [28]. Our study is different from these studies in that we attempted to use CXR imaging as the main tool for outcome prediction. Considering the pathogenesis of COVID-19 and the mechanism of action of corticosteroids, the use of an imaging modality is the better option for assessing corticosteroid responsiveness.
Our study shows that quantified scores presented by AI systems could be used to predict corticosteroid responsiveness. The efficacy of corticosteroids is dependent on their ability to reduce cytokines by suppressing inflammatory cells involved in adaptive immunity. This would prevent alveolar damage triggered by the reaction, which is likely to be detected by the imaging modality [29][30][31][32][33][34]. Recent studies involving AI-assisted image analysis shed light on the utilization of the technique in diagnosing and predicting the prognosis of COVID-19 by quantifying opacification of the alveolar system or interstitial tissue and precise localization with augmentation [19,35]. These characteristics enable the implementation of a simple imaging modality such as CXR in a setting where a complex imaging technique is unavailable due to the high risk of disease exposure. In this study, we showed that quantified scores indicating the probability of a lesion, based on a simple modality such as CXR, can be used to identify patients with poor responses. Since consolidation is a frequently observed finding in COVID-19-associated pneumonia, typically appearing around 6-7 days after the onset of symptoms, it is understandable that there is an association between the consolidation score within 10 days after the onset of symptoms and the treatment outcome [36]. This is also consistent with the recent report of an association between the extent of pneumonia on CXR and poor outcomes [13]. Notably, the gap in pleural effusion score widened approximately a week after symptom onset, when viral replication usually phases out and hyper-immune activation phases in. Pleural effusion is not directly associated with COVID-19; however, it could be associated with hyper-immune activation such as multi-system inflammatory syndrome through endothelial damage [37,38]. Our study presents tangible evidence of the proposed mechanism of corticosteroid action. Therefore, we can interpret our finding as follows: an aberrant immune reaction, expressed as consolidation at an early stage, led to pleural effusion at a later stage that could not be slowed down through corticosteroid administration, resulting in poor treatment response. Considering that disease progression is associated with hyperimmune activation, patients with consolidation at approximately one week from the onset of symptoms and unimproved pleural effusion should be considered for preliminary therapy with other immune-modulating agents.
Our results indicate that AI-CXR scores linearly correlated with other biomarkers associated with unfavorable outcomes. This would help resolve questions related to AI and its implementation in clinical practice. Consistent with previously identified prognostic factors of COVID-19 [27,39,40], CRP level and lymphocyte percentage were associated with unfavorable treatment responses in this analysis. AI-CXR scores had a linear correlation with high CRP, low lymphocyte percentages, and low albumin levels. The association between high AI-CXR scores and these variables suggests that AI-CXR scores can be used to describe disease severity. AI-CXR scores showed a similar significance in the subgroup analysis including only patients who required oxygen therapy, except for the pleural effusion score in category 2. This might be attributed to the small sample size. Further studies with a larger number of participants are required to verify whether the AI-CXR score can be reliably applied to those who need corticosteroid treatment as the current guidelines stipulate. AI-CXR scores had a significant effect on the outcome in both the Delta and Omicron variant-dominant periods. In South Korea, the Omicron variant gained dominance by the first week of January 2022. It is noteworthy that different CXR findings were significant in terms of corticosteroid responsiveness during each period. In the Delta variant-dominant period, consolidation scores were associated with an unfavorable corticosteroid response, while pleural effusion scores predicted it in the Omicron variant-dominant period. The Omicron variant is known for atypical presentation on chest CT compared to the Delta variant, despite its reduced virulence [41]. This might indicate that the Omicron strain could still be a threat to patients with weakened immune systems by causing similar pathophysiologic damage to a patient's respiratory system. Application of AI-CXR scores to other pathogenic organisms is expected.
This study has some limitations. First, most of the patients in the unfavorable group had been transferred due to critical conditions; therefore, their outcomes are not known, and thus, the results of the unfavorable group may have been overestimated. Second, because of the retrospective design and small sample size, missing values in laboratory and CXR tests could have affected the statistical power of this study. However, we believe our results are important because we obtained statistical significance when we excluded the missing data due to early transfer or discharge from the analysis instead of using data imputation. Third, the diagnostic accuracy of the AI-based software was not evaluated against radiologists' reports or CT scans because this was outside the scope of our study. However, this software is already well-known for its high diagnostic performance in other studies [42,43], and we attempted to investigate the robustness of the AI-CXR score by examining the correlation with other biomarkers instead. Fourth, we used scores showing the possibility of the presence of the lesions presented by AI for the analysis. It is debatable whether abnormality scores presented by AI can constitute an absolute quantitative value to represent disease extent or severity, unlike other quantitative imaging markers, due to the undefinable characteristics of AI itself. However, we assumed that increased area or opacity on CXR could increase the possibility of prediction by AI and decided to set this value for monitoring treatment response throughout this study.

Conclusions
This study demonstrated a negative correlation between corticosteroid response and AI-generated pleural effusion scores obtained approximately one week after the onset of symptoms, as well as consolidation scores obtained during the early stage after the onset of symptoms. Patients with signs of poor response should be considered for pre-treatment with other immune-modulating agents. Further validation of the technology involving patients with different disease entities is warranted.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/jcm12185852/s1, Table S1: Eight-categories-ordinal scale of the World Health Organization; Table S2: Association of laboratory tests with unfavorable corticosteroid response according to time category; Table S3: The differences in laboratory values according to corticosteroid responsiveness; Figure S1: Examples of artificial intelligence-generated abnormality scores incorporated in patients' chest radiographs; Figure S2: Differences in artificial intelligence-generated chest radiograph scores according to corticosteroid responsiveness and time category; Figure S3: Association between consolidation score and unfavorable corticosteroid responsiveness according to the time category and subgroup; Figure S4: Association between pleural effusion score and unfavorable corticosteroid responsiveness according to the time category and subgroups. Reference [44] is cited in Supplementary Materials.
Funding: This study was supported by a faculty research grant from Yonsei University College of Medicine (6-2022-0083).
Institutional Review Board Statement: This study was approved by the Institutional Review Board of Yonsei University Health System Clinical Trial Centre, and the study protocol adhered to the tenets of the Declaration of Helsinki (approval number 9-2022-0187, approved on 27 January 2022).

Informed Consent Statement:
As this was a retrospective study, the Institutional Review Board waived the requirement for written informed consent from the participants.

Figure 1. Flow chart of patient enrollment in the analysis.


Figure 2. Receiver operating characteristic (ROC) curve of artificial intelligence-generated chest radiograph (AI-CXR) scores for predicting corticosteroid responsiveness. (A) ROC curve of the consolidation score. (B) ROC curve of the pleural effusion score. ROC curves of sequential AI-CXR scores were drawn according to time category. Abbreviations: CAT 0: category 0, 0-2 days from the onset of symptoms; CAT 1: category 1, 3-5 days from the onset of symptoms; CAT 2: category 2, 6-9 days from the onset of symptoms; CAT 3: category 3, more than 10 days from the onset of symptoms; AUC, area under the curve.


Table 1. Baseline characteristics of the hospitalized COVID-19 patients treated with corticosteroids.
Data are expressed as mean ± standard deviation, median [Q1-Q3], or number with percentages. Abbreviations: DM, diabetes mellitus; COPD, chronic obstructive pulmonary disease; CHF, chronic heart failure; CKD, chronic kidney disease; Dz., disease; CCI, Charlson comorbidity index. * Unfavorable corticosteroid responsiveness was defined as either advancement of two or more of the eight-categories-ordinal scale established by the World Health Organization or no improvement from the initial 5th or worse category. † p-value was calculated using Fisher's exact test; ‡ Days between steroid initiation and COVID-19 confirmation.

Table 2. Association of artificial intelligence-generated chest radiograph abnormality score with unfavorable corticosteroid response according to time category.

Table 3. The relationship of artificial intelligence-generated chest radiograph abnormality score with other laboratory variables.